Abstract
The effects of two components of formative evaluation, (a) frequency of
measurement and (b) data utilization, were compared in order to isolate
formative evaluation components which teachers might routinely use to
monitor achievement. Fifty-two learning disabled and educable mentally
retarded students enrolled in regular class programs and receiving reading
instruction in a special education resource room were randomly assigned
to either (a) a pre-posttest non-data-based change group, (b) a daily
measurement non-data-based change group, (c) a daily measurement
data-based change group, or (d) an untreated control group. Analysis of
results of oral reading data supported daily measurement and data-based
changes as effective components of formative evaluation.


Federal regulations promulgated under the Education for All Handicapped
Children Act of 1975, PL 94-142, require the development of an
Individual Educational Program (IEP) which specifies annual and
short-term objectives whenever a student is identified as requiring special
education service. While logical arguments to support use of objectives
in the development of educational programs have been proposed (Mager, 1962;
Popham & Husek, 1969; Steiner, 1975; Tyler, 1950), empirical verification
of the beneficial achievement effects of specifying objectives is lacking.
Equal numbers of studies can be found in which significant and
nonsignificant results are reported (Duchastel & Merrill, 1973; Hartley &
Davies, 1976). A major factor in these equivocal results may be the lack of
adequate evaluation procedures to guide the teacher in effective decision
making during the instructional program (Crutcher & Hofmeister, 1975).

Traditionally, educational evaluation has been oriented to placement
and summative decision making. While psychologists and educational
diagnosticians routinely use diagnostic testing procedures which have
formative decision-making potential, these procedures are not the usual
classroom practice. Review and reteaching are the usual instructional
decisions and relate only to items missed in a post test. A study of
teacher decision making by Zahorik (1975) supports this view. He found
planning decisions regarding evaluation, diagnosis, and instructional
strategies were made by fewer than one-third of the 194 teachers studied.
Similar findings were previously reported by Goodlad and Klein (1974)
and Popham and Baker (1970).

Formative evaluation is concerned with the evaluation of educational
programs still in some stage of development (Scriven, 1967). Unlike
placement and summative evaluation, formative evaluation is intended to
lead to the improvement of instruction during the teaching process
itself by providing feedback to both teacher and student regarding
objective mastery (Conroy, 1973; ord 1972; STOW, 1977; Sullivan, 1971;
Sherman, Note 1).

While there is considerable agreement that the key to improved
instruction and educational decision making by teachers may be formative
evaluation procedures, the most effective components of a formative
evaluation system have not been isolated or systematically compared
(Sullivan, 1971). Sullivan recommends identification of precise objectives
in initial planning and the development of a detailed system of monitoring
and recording achievement of objectives as important to the success of a
formative evaluation system. Important concerns remain, however, regarding
(a) the frequency of test administration required to make appropriate
decisions during the instructional program, and (b) the way in which the
collected data are utilized.

Recommendations regarding frequency of measurement vary from the
periodic pre-post measurement approach described by Van Etten and Van
Etten (1976) as noncontinuous measurement, to the direct and daily
continuous measurement approach advocated by those who practice the
technology of precision teaching (Alper & White, 1971; poten 19715,
Kunzelman, Mat Lindsley, 1964; Lovitt, 1967; White & Haring, 1976;
Haring & beiee, Note 2; Starlin, Note 3; White & Liberty, Note 4).

It is argued by proponents of this view that only continuous
measurement and analysis of performance permits the teacher to make changes
in the program when it will be of the greatest benefit to the student
(Starlin, 1971).

The issue of data utilization is also one which has not been
adequately resolved with respect to formative evaluation. One solution
has been to establish a set of rules which provides a standard method
for daily program analysis (Liberty, Notes 5 and 6). The rules attempt
to take the "guesswork" out of analysis of daily measurement data by
providing guidelines with respect to the length of time an intervention
should be maintained for individual programs. The rules are determined
not only by the progress of the student but by the objective (aim) of
the program as well. The rules add an important dimension to formative
evaluation not addressed in the pre-posttest paradigm.


A limited number of studies is reported in the research literature
where attempts have been made to systematically isolate effective
components of a formative evaluation system. Jenkins, Mayhall, Peschka, and
Townsend (1974) compared charted and non-charted feedback of daily
measurement data to teachers and students and obtained results which
significantly favored the charted feedback group. Frumess (Note 7) compared
different degrees of self-management when used with daily measurement and
found significant differences favoring students who charted their own
scores compared to students for whom there was no self-charting or
teacher charting of performance.

In an investigation of daily measurement and decision rules,
Bohannon (Note 8) compared teacher judgment as the predominant formative
evaluation procedure with daily measurement and data decision rules;
he reported results which favored students in the latter treatment.
Of particular interest were findings which suggested that for eight of
the 23 students in the study one minute of daily measurement was
sufficient to improve achievement, thus making it unnecessary for the
teacher to use any decision rules to make program adjustments.


The present study further explored Bohannon's findings by
contrasting student achievement under conditions of daily measurement and
daily measurement with data decision rules. In addition, a third
treatment was introduced in which pre-posttest measurement was the only
formative evaluation procedure systematically implemented by the teacher.
The research was designed to answer the following questions:
   1. Does daily measurement increase student performance on
      objectives beyond that attained with pre and posttesting?
   2. Does adding a data utilization component increase student
      performance on objectives beyond that attained with daily
      measurement alone?


Method
Participants

Fifty-two children in grades two through six who had been classified
by the school placement team as learning disabled or educable mentally
retarded and 13 special education resource teachers in four metropolitan
school districts in Minnesota participated in the study. The
students were enrolled in regular class programs. Daily reading
instruction, however, was provided in resource rooms by the resource
teachers.
Treatments

Four students were randomly selected from each resource teacher's
existing caseload and randomly assigned to one of three experimental
treatment groups or to an untreated control group. The 13 subjects
in each treatment group then received reading instruction which included
one of three different sets of formative evaluation procedures: (a)
pre-post measurement, non-data-based change (PPN), (b) daily measurement,
non-data-based change (DMN), or (c) daily measurement, data-based change
(DMD). Analyses of variance of pretest performances revealed no reliable
differences between groups.


Instruments

Four types of data were used to analyze treatment effects. Measures
of oral reading rate correct, oral reading rate incorrect, vocabulary
meaning, and comprehension were obtained for all students both prior to and
following treatment. The first three measures were derived from stories
randomly selected from Levels Ib, IIb, and IIIb of the Power Builder Kits
(SRA, 1963, 1969). Each student read orally for three minutes and was
asked to define five words which had previously been randomly selected
from the first 100 words of the story. The words read correctly and
incorrectly were then counted, and each total was divided by three to
obtain the per minute rate. The total number of words defined correctly was
determined by teacher judgment. When in doubt, the first definition in
the dictionary was used as the criterion. A measure of each student's
reading comprehension was obtained using the Stanford Diagnostic Reading
Test, Level I, Forms W and X (1968) and the comprehension subtest of the
Stanford Diagnostic Reading Test, Level II, Forms W and X (1966). Daily
measures of oral reading correct and incorrect and vocabulary meaning
in the SRA Power Builder stories also were obtained for students in the
daily measurement and data decision rule groups (N = 26).
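
The rate calculation described above (word counts from a three-minute
timed reading, divided by three) can be sketched as follows. The function
name and the sample counts are hypothetical and serve only to illustrate
the arithmetic; they are not taken from the study's data.

```python
def per_minute_rates(words_correct, words_incorrect, minutes=3):
    """Convert word counts from a timed oral reading into per-minute rates.

    Assumption: the timing is a fixed three-minute sample, as described in
    the Instruments section. The function name is illustrative only.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return words_correct / minutes, words_incorrect / minutes


# Hypothetical counts: 186 words read correctly and 12 read incorrectly
# during one three-minute reading.
correct_rate, error_rate = per_minute_rates(186, 12)
print(correct_rate, error_rate)  # 62.0 4.0
```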


Specific Procedures

All subjects in the study were placed for reading instruction by
the experimenter in three levels of the SRA Power Builder Kits (1963, 1969).
Placement was determined by identifying passages which the student could
read orally at the rate of 50 to 75, 35 to 60, and 30 to 40 words correctly
per minute.¹ Error performance was not used in making the placement
decision.

Each student's oral reading performance was measured on three
days at each of the three levels to reliably establish initial performance.
A 30 percent increase in oral reading rate correct was arbitrarily
established as the 18-day objective for all students in the experimental
treatments. The desired level at 18 days was determined by multiplying the
median initial oral reading correct score at each level by a factor of 1.3.
To establish daily objectives for the daily measurement and data decision
rule groups, a straight increasing daily aim line was drawn on an equal
interval graph connecting the median initial level with the desired level
at 18 days (Liberty, Notes 5 and 6). The daily objective for error rate was
to remain at or below the median initial error rate. This objective was
shown on the equal interval graph by drawing a straight line across the
graph at the student's initial median error rate for each level. An example
of a graph with initial data points and daily aim line drawn for both
correct and error rates is shown in Figure 1.
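
As a minimal sketch of how such an aim line might be computed rather than
drawn by hand, the fragment below takes the median of the initial scores,
multiplies it by 1.3 to get the 18-day target, and interpolates linearly
between the two. The function names, the day-0 indexing, and the sample
baseline scores are assumptions made for illustration.

```python
from statistics import median


def aim_line(baseline_scores, days=18, factor=1.3):
    """Return daily aim values from day 0 (median baseline) to the target day.

    Assumptions: day 0 is the median initial level, the target falls on
    `days`, and the line is straight on an equal-interval scale.
    """
    start = median(baseline_scores)
    target = start * factor
    step = (target - start) / days
    return [start + step * day for day in range(days + 1)]


def error_aim(baseline_errors):
    """The error objective is a flat line at the median initial error rate."""
    return median(baseline_errors)


# Hypothetical baseline: three initial correct rates at one level.
line = aim_line([48, 50, 53])
print(round(line[0], 2), round(line[18], 2))  # 50 65.0
```

Under these assumptions each day's objective rises by the same amount,
(target - start) / 18, which is what a straight line on an equal interval
graph implies.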


The sequence of instructional activities for all groups was as
follows: Each student received 20 minutes of reading instruction daily
from the special education resource teacher. Instruction consisted of
reading nine stories twice at each level over an 18-day period. Students
read aloud for three minutes at each of the three placement levels.
Students were then asked to define five words from each story. Error
correction and word meaning correction were given.

Each of the treatment groups differed from one another with respect
to the daily formative evaluation procedures used as follows:

Daily Measurement, Data-based Change (DMD): The teacher and
student reviewed the graph each day to determine whether the daily
objective had been achieved. If daily data points were plotted below
the aim line for two consecutive data days, a new aim line was drawn
parallel to the original line (i.e., the target date was extended) and
a program change was made. If the daily data points were plotted above
the daily aim line for five consecutive days, a new daily aim line was
drawn parallel to and above the original line and a program change was
made. Examples of original and redrawn daily aim lines are shown in
Figure 2.

Error data were also reviewed daily. If daily data points were
plotted above the median error line for two data days, a new median
error line was drawn and a program change was made. If error data
were plotted below the median line for five days, the same procedure
was followed.
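
The two decision rules for the correct-rate data can be expressed as a
short routine. This is a hedged sketch, not the rule set from Liberty
(Notes 5 and 6): the function name and event labels are hypothetical, a
score exactly on the line is assumed to break both runs, and the sketch
only flags the days on which a rule fires. It does not redraw the aim
line, so the caller would need to supply updated aim values after each
flagged day.

```python
def review_correct_rate(scores, aim_values, below_run=2, above_run=5):
    """Flag the days on which a data decision rule would call for a change.

    scores and aim_values are parallel lists of daily correct rates and
    daily aim-line values. Returns a list of (day, action) tuples, with
    days numbered from 1.
    """
    events = []
    below = above = 0
    for day, (score, aim) in enumerate(zip(scores, aim_values), start=1):
        if score < aim:
            below, above = below + 1, 0
        elif score > aim:
            above, below = above + 1, 0
        else:
            # Assumption: a point exactly on the line resets both counts.
            below = above = 0
        if below == below_run:
            events.append((day, "extend_aim_and_change_program"))
            below = 0
        elif above == above_run:
            events.append((day, "raise_aim_and_change_program"))
            above = 0
    return events


# Hypothetical eight days of data against a flat aim of 50 for simplicity.
print(review_correct_rate([50, 49, 48, 55, 56, 57, 58, 59], [50] * 8))
# [(3, 'extend_aim_and_change_program'), (8, 'raise_aim_and_change_program')]
```

The same counting logic would apply to the error data with the comparison
directions reversed (two days above the median error line, five days
below it).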


The teachers made a series of program changes as a function
of the student's performance. The changes were, in sequence:
   1. Each day a data point was plotted on or above the line,
      the student received a gum ball dispensed by placing a
      penny supplied by the teacher in a gum ball machine.
   2. Each day a data point was plotted on or above the line,
      the student received a gummed sticker of his/her choice.
   3. Each day a data point was plotted on or above the line,
      the student received a gummed dot which was placed on a
      card. Five dots could be exchanged for a tangible item
      such as a book folder, an opportunity to work in the office
      or operate the audiovisual equipment, or any other similar
      school activity based on individual interest.

Daily Measurement, Non-data-based Change (DMN): Following timed
oral reading, teacher and student marked the graph and checked to see
whether the daily objective had been achieved. Teachers provided
encouragement with positive statements and praise. Whenever program
changes were implemented for the DMD group, they were also implemented
for this group.²

Pre-post Measurement, Non-data-based Change (PPN): Following daily
oral reading teachers praised students and thanked them for reading.
Whenever program changes were implemented for the DMD group, they were
also implemented for this group.

Untreated Control Group (UC): This group came to the resource
room daily for regular reading instruction of approximately the same
duration (20 minutes) as the experimental groups. No controls were
exerted over the reading instruction of students in this group.

Immediately following the 18-day instruction period, the oral
reading fluency and vocabulary meaning performance of all students was
again measured on three days at each of the three levels in which initial
performance was obtained. On the third day, a measure of reading
comprehension was also obtained using the Stanford Diagnostic Reading Test,
Level I, Form W (1968) for students in grades two and three and the
Stanford Diagnostic Reading Test, Level II, Form W (1968) for students in
grades four through six.


Results

The post treatment data for all groups on all measures appear in
Table 1. One way analyses of variance were conducted on the posttest
means and are shown in Tables 2 and 3.
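
For readers who want to see how a one-way analysis of variance follows
from group summaries, the sketch below computes an F ratio from means,
standard deviations, and a common group size. It is an illustration only:
the function name is hypothetical, equal group sizes of 13 are assumed
from the Method section, and the inputs are the Frustration Level oral
reading correct means and standard deviations as printed in Table 1. The
output is not a reproduction of the values the authors report in Table 2.

```python
def anova_from_summary(means, sds, n):
    """One-way ANOVA F ratio from group means and SDs with equal group size n.

    Returns (F, (df_between, df_within)). With equal n, the within-groups
    mean square is the average of the group variances.
    """
    k = len(means)
    grand_mean = sum(means) / k
    ms_between = n * sum((m - grand_mean) ** 2 for m in means) / (k - 1)
    ms_within = sum(sd ** 2 for sd in sds) / k
    return ms_between / ms_within, (k - 1, k * (n - 1))


# Table 1, Oral Reading Correct, Frustration Level: PPN, DMN, DMD, UC.
f_ratio, df = anova_from_summary(
    means=[40.00, 38.69, 46.23, 38.54],
    sds=[6.89, 6.37, 7.29, 4.03],
    n=13,
)
print(round(f_ratio, 2), df)  # 4.37 (3, 48)
```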


As can be seen, the differences among group means obtained following
treatment were reliable at Independent and Frustration Levels for the
oral reading correct measure, but not for the other dependent measures.
A post hoc analysis using a Student-Newman-Keuls procedure was conducted
and is presented in Table 4.


The results of the paired comparisons revealed that the DMD group
performance exceeded the other three groups at both Frustration and
Independent Levels with one exception. The performance of the DMD group
was apparently equal to that of the DMN group at the Independent Level.
The post hoc analysis also revealed a difference between DMN and the UC
group at the Independent Level.

Discussion

The results of the present study provide evidence that variations
in teacher measurement procedures and in how measurement data are used
to make program decisions can significantly influence student performance.
Several noteworthy conclusions regarding formative evaluation are
supported by the obtained data.

The most important conclusion which may be supported by the results
of the present study is that systematic formative evaluation most
effectively contributes to student achievement when rules for the
utilization of measurement data are included as part of the formative
evaluation system. When teachers measured student oral reading performance
daily in relation to daily goals, and altered both goals and consequences
contingent upon measured student performance relative to goals, superior
achievement occurred. These findings are consistent with the results
obtained by other researchers (e.g., Bohannon, Note 8).

It should be recalled that students in the daily measurement
treatment received exactly the same number and type of program changes, on
the same schedule, as students in the data-utilization treatment, yet
in only one case did their performance exceed even the untreated
control group. In contrast, the data-utilization group exceeded the
untreated control and the pre-post groups at both Frustration and
Independent reading levels, and exceeded the daily measurement group at
Frustration Level.

We need to take note that the data utilization treatment was a
complex treatment. The separate effects of daily measurement and rules
for altering goals and delivering consequences cannot be determined.
Teachers may be able to efficiently alter goals and deliver consequences
if they are not required to use the particular data-utilization rules
employed in the present study, or if they are not required to measure
daily. Our position is, however, that what is essential in formative
evaluation is a determinate relation between measurement data and program
changes, and that formative evaluation consistently improves as
improvements are made in measurement and the procedures for utilizing
measurement data. The present results, we believe, support the conclusion
that daily measurement of student performance is an important component of
formative evaluation only when procedures for utilizing daily performance
data are required.

A second conclusion supported by the data analysis is that
alterations in formative evaluation procedures seem to impact most directly
the behavior which is measured and used as the basis for instructional
decision making. In this study, teachers measured oral reading rate,
used that data to make changes in the daily oral reading rate goals, and
in whether or what consequences were delivered for achieving those daily
oral reading goals. Although teachers measured and recorded student
performance on vocabulary meaning, they did not set daily objectives
for this behavior and did not use vocabulary meaning data to make
program changes. The results were that treatment effects were revealed in
the oral reading correct data but not in the vocabulary meaning data
nor in the standardized comprehension measure. Although a daily aim
was set for errors, the aim was essentially to maintain initial error
rates rather than to decrease error rate; it is therefore also not
surprising that the initial equivalence among treatment groups for
this behavior remained at post testing. If, as the results of this study
suggest, the advantage of formative evaluation accrues primarily to
the behavior that is measured and used to make instructional decisions,
then considerable importance must be invested in decisions regarding
what behaviors to measure.

A third conclusion regarding the effective components of formative
evaluation which may be derived from the present results is that
traditional pre and posttesting on a particular objective does not
contribute to improved achievement. Students whose performance in oral
reading was measured initially and again at the end of treatment increased
no more than students in the untreated control group. This finding is made
all the more remarkable by the fact that the students in the pre and post
test treatment actually systematically practiced oral reading each day
while the students in the untreated control group did not. The importance
of this failure of pre and posttesting as an approach to formative
evaluation is that it calls into question the purpose of the most
pervasive informal approach used by teachers to monitor student
achievement.

A final comment regarding measuring student performance in reading
for purposes of formative evaluation should be made. One may argue that
oral reading rate in the basal reader is of questionable importance as
an educational objective. Evidence is available to the contrary,
however. Oral reading, as a measure of decoding skill, is highly related to
reading achievement and to comprehension (Deno, Chiang, Mirkin, &
Lowry, Note 9). Oral reading performance, then, serves as a convenient
index of reading proficiency. An interesting finding in the present
study is that oral reading at Independent and Frustration Levels was more
sensitive to treatment effects. It may be that formative evaluation of
reading requires regular measurement on content external to daily
instruction. Further research on what to measure as a part of formative
evaluation in reading is required.


Footnotes

¹These rates correspond to the lower end of the independent,
instruction, and frustration level rates recommended by Starlin (1973)
when making placement decisions for primary and intermediate grade
remedial students.

²Since the data decision rule treatment requires a program change
whenever a student does not achieve the daily objective for two
consecutive data days, program changes were also implemented for students
in the other experimental groups to control for the possibility that
differences could be attributed to the program changes rather than to
the formative evaluation procedures.

Table 1

Posttest Means and Standard Deviations for Each Group by Level on Four
Dependent Measures

                                             Group
                              PPN            DMN            DMD             UC
                          Mean    SD     Mean    SD     Mean    SD     Mean    SD
Oral Reading Correct
   Independent           70.69  12.48   75.62  98.43   82.77  13.15   63.85  11.92
   Instruction           56.46  10.23   56.23  10.01   61.92   9.11   53.31   8.64
   Frustration           40.00   6.89   38.69   6.37   46.23   7.29   38.54   4.03
Oral Reading Incorrect
   Independent            4.47   1.62    2.09   3.36    2.87   1.96    3.89   4.20
   Instruction            4.92   1.42    4.06   1.57    4.65   2.29    4.50   4.23
   Frustration            6.71   1.57    5.60   2.20    6.28   2.56    6.94   3.86
Vocabulary Meaning
   Independent           78.46  19.08   76.92  35.45   92.31  10.13   86.15  17.10
   Instruction           55.38  30.72   63.08  29.26   69.23  27.83   63.08  25.62
   Frustration           27.69  20.88   34.62  27.87   36.92  24.28   33.85  30.97

Table 3

Summary of the Analysis of Variance on Posttest Means for Comprehension

Figure 1

A GRAPH WITH BASELINE DATA POINTS AND DAILY AIM LINES DRAWN

Figure 2

ORIGINAL AND REDRAWN DAILY AIM LINES